Logistic Regression and Boosting for Labeled Bags of Instances

Authors

Xin Xu

Eibe Frank

Abstract

In this paper we upgrade linear logistic regression and boosting to multi-instance data, where each example consists of a labeled bag of instances. This is done by connecting predictions for individual instances to a bag-level probability estimate by simple averaging and maximizing the likelihood at the bag level—in other words, by assuming that all instances contribute equally and independently to a bag’s label. We present empirical results for artificial data generated according to the underlying generative model that we assume, and also show that the two algorithms produce competitive results on the Musk benchmark datasets.
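The averaging idea described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it reads "simple averaging" as the arithmetic mean of instance-level logistic probabilities, and it uses plain gradient ascent as a stand-in optimizer. The function names (`bag_probabilities`, `bag_log_likelihood`, `fit_bag_logistic`) and the learning-rate and step-count defaults are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bag_probabilities(bags, w, b):
    """Bag-level probability = mean of instance-level logistic probabilities."""
    return np.array([sigmoid(X @ w + b).mean() for X in bags])

def bag_log_likelihood(bags, y, w, b, eps=1e-12):
    """Bernoulli log-likelihood evaluated on bag labels, not instance labels."""
    p = np.clip(bag_probabilities(bags, w, b), eps, 1 - eps)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fit_bag_logistic(bags, y, lr=0.1, steps=500, eps=1e-12):
    """Maximize the bag-level likelihood by gradient ascent (assumed optimizer)."""
    d = bags[0].shape[1]
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        grad_w, grad_b = np.zeros(d), 0.0
        for X, yi in zip(bags, y):
            s = sigmoid(X @ w + b)            # instance-level probabilities
            p = np.clip(s.mean(), eps, 1 - eps)  # bag-level probability
            dl_dp = yi / p - (1 - yi) / (1 - p)  # derivative of log-likelihood w.r.t. p
            dp_dz = s * (1 - s) / len(s)         # each instance contributes equally
            grad_w += dl_dp * (dp_dz @ X)
            grad_b += dl_dp * dp_dz.sum()
        w += lr * grad_w / len(bags)
        b += lr * grad_b / len(bags)
    return w, b
```

Each bag is passed as an array of shape `(n_instances, n_features)`, and `y` holds one 0/1 label per bag. Because every instance enters the bag probability with weight `1 / n_instances`, the gradient treats all instances in a bag as contributing equally, which mirrors the assumption stated in the abstract.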

Similar resources

Reducing Annotation Effort using Generalized Expectation Criteria

Generalized expectation (GE) criteria [McCallum et al., 2007] are terms in objective functions that assign scores to values of model expectations. In this paper we introduce GE-FL, a method that uses GE to train a probabilistic model using associations between input features and classes rather than complete labeled instances. Specifically, here the expectations are model predicted class distrib...

Multiple Instance Metric Learning from Automatically Labeled Bags of Faces

Metric learning aims at finding a distance that approximates a task-specific notion of semantic similarity. Typically, a Mahalanobis distance is learned from pairs of data labeled as being semantically similar or not. In this paper, we learn such metrics in a weakly supervised setting where “bags” of instances are labeled with “bags” of labels. We formulate the problem as a multiple instance le...

Multiple Instance Learning with Query Bags

In many machine learning applications, precisely labeled data is either burdensome or impossible to collect. Multiple Instance Learning (MIL), in which training data is provided in the form of labeled bags rather than labeled instances, is one approach for dealing with ambiguously labeled data. In this paper we argue that in many applications of MIL (e.g. image, audio, text, bioinformatics) a s...

Review of Multi-Instance Learning and Its applications

Multiple Instance Learning (MIL) is proposed as a variation of supervised learning for problems with incomplete knowledge about labels of training examples. In supervised learning, every training instance is assigned with a discrete or real-valued label. In comparison, in MIL the labels are only assigned to bags of instances. In the binary case, a bag is labeled positive if at least one instanc...

Boosted Regression (Boosting): An introductory tutorial and a Stata plugin

Boosting, or boosted regression, is a recent data mining technique that has shown considerable success in predictive accuracy. This article gives an overview over boosting and introduces a new Stata command, boost, that implements the boosting algorithm described in Hastie et al. (2001, p. 322). The plugin is illustrated with a Gaussian and a logistic regression example. In the Gaussian regress...

Publication date: 2004
